Overview

Dataset statistics

Number of variables: 11
Number of observations: 5473
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 67
Duplicate rows (%): 1.2%
Total size in memory: 470.5 KiB
Average record size in memory: 88.0 B
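
The two derived figures above (duplicate percentage and average record size) follow from the raw counts. A minimal sketch of that arithmetic, using only numbers copied from the statistics above; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics above.
n_observations = 5473
duplicate_rows = 67
total_size_kib = 470.5

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average bytes per record: total memory (KiB converted to bytes) over row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (1.2% and 88.0 B).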

Variable types

NUM: 11

Reproduction

Analysis started: 2020-08-25 01:43:04.749480
Analysis finished: 2020-08-25 01:43:24.197052
Duration: 19.45 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 67 (1.2%) duplicate rows (Duplicates)
blackand is highly correlated with blackpix (High correlation)
blackpix is highly correlated with blackand (High correlation)
height is highly skewed (γ1 = 20.37000095) (Skewed)
mean_tr is highly skewed (γ1 = 67.42540616) (Skewed)
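
The skewness warnings can be illustrated with a short sketch. It assumes that γ1 is the bias-adjusted sample skewness (the adjusted Fisher-Pearson coefficient); the report does not state the formula. The cut-off of 20 is also an assumption, chosen because height (20.37) is flagged while area (19.52) is not. The function names are hypothetical.

```python
import math

SKEW_THRESHOLD = 20  # assumed cut-off; not stated in the report


def sample_skewness(values):
    """Bias-adjusted sample skewness (adjusted Fisher-Pearson coefficient)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def is_highly_skewed(skewness, threshold=SKEW_THRESHOLD):
    """Flag a variable whose absolute skewness exceeds the threshold."""
    return abs(skewness) > threshold


# Skewness values copied from the report.
reported = {"height": 20.37000095, "area": 19.52376995, "mean_tr": 67.42540616}
flagged = [name for name, g in reported.items() if is_highly_skewed(g)]
print(flagged)
```

Under these assumptions the flagged list matches the two "Skewed" warnings above.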

Variables

height
Real number (ℝ≥0)

SKEWED

Distinct count: 104
Unique (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.473232230951947
Minimum: 1.0
Maximum: 804.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 7
median: 8
Q3: 10
95-th percentile: 20
Maximum: 804
Range: 803
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 18.96056392
Coefficient of variation (CV): 1.810383222
Kurtosis: 659.6587546
Mean: 10.47323223
Median Absolute Deviation (MAD): 2
Skewness: 20.37000095
Sum: 57320
Variance: 359.502984

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
8                     940    17.2%
9                     903    16.5%
7                     831    15.2%
10                    630    11.5%
6                     351    6.4%
11                    326    6.0%
5                     271    5.0%
1                     254    4.6%
12                    164    3.0%
13                    103    1.9%
14                    89     1.6%
2                     87     1.6%
4                     64     1.2%
3                     58     1.1%
16                    46     0.8%
24                    39     0.7%
17                    35     0.6%
25                    33     0.6%
15                    20     0.4%
23                    16     0.3%
22                    15     0.3%
28                    15     0.3%
19                    11     0.2%
18                    10     0.2%
38                    10     0.2%
Other values (79)     152    2.8%

Smallest values

Value                 Count  Frequency (%)
1                     254    4.6%
2                     87     1.6%
3                     58     1.1%
4                     64     1.2%
5                     271    5.0%
6                     351    6.4%
7                     831    15.2%
8                     940    17.2%
9                     903    16.5%
10                    630    11.5%

Largest values

Value                 Count  Frequency (%)
804                   1      < 0.1%
430                   1      < 0.1%
311                   1      < 0.1%
306                   1      < 0.1%
304                   1      < 0.1%
261                   1      < 0.1%
212                   1      < 0.1%
197                   1      < 0.1%
187                   2      < 0.1%
186                   1      < 0.1%
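
Several of the statistics reported for height are tied together by simple identities, and the same holds for every other variable in this report. A minimal sketch that re-derives them from the values printed above; the variable names are illustrative:

```python
# Values copied from the height statistics above.
mean = 10.47323223
std = 18.96056392
q1, q3 = 7, 10
minimum, maximum = 1, 804
n_observations, total = 5473, 57320

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_from_sum = total / n_observations

print(f"CV={cv:.6f} variance={variance:.4f} IQR={iqr} range={value_range}")
print(f"mean from sum={mean_from_sum:.6f}")
```

The derived numbers agree with the reported CV (1.810383222), variance (359.502984), IQR (3), range (803), and mean.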
 

lenght
Real number (ℝ≥0)

Distinct count: 452
Unique (%): 8.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 89.56824410743651
Minimum: 1.0
Maximum: 553.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 17
median: 41
Q3: 107
95-th percentile: 346
Maximum: 553
Range: 552
Interquartile range (IQR): 90

Descriptive statistics

Standard deviation: 114.7217575
Coefficient of variation (CV): 1.280830708
Kurtosis: 4.203579984
Mean: 89.56824411
Median Absolute Deviation (MAD): 30
Skewness: 2.104048249
Sum: 490207
Variance: 13161.08164

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
12                    126    2.3%
13                    118    2.2%
14                    113    2.1%
7                     112    2.0%
8                     105    1.9%
11                    105    1.9%
9                     103    1.9%
19                    102    1.9%
18                    96     1.8%
20                    90     1.6%
17                    89     1.6%
16                    83     1.5%
4                     73     1.3%
22                    68     1.2%
21                    67     1.2%
23                    66     1.2%
1                     65     1.2%
10                    64     1.2%
15                    63     1.2%
32                    62     1.1%
6                     59     1.1%
33                    57     1.0%
41                    54     1.0%
25                    54     1.0%
26                    53     1.0%
Other values (427)    3426   62.6%

Smallest values

Value                 Count  Frequency (%)
1                     65     1.2%
2                     28     0.5%
3                     33     0.6%
4                     73     1.3%
5                     45     0.8%
6                     59     1.1%
7                     112    2.0%
8                     105    1.9%
9                     103    1.9%
10                    64     1.2%

Largest values

Value                 Count  Frequency (%)
553                   1      < 0.1%
552                   1      < 0.1%
550                   2      < 0.1%
547                   1      < 0.1%
544                   1      < 0.1%
541                   1      < 0.1%
538                   4      0.1%
537                   13     0.2%
536                   12     0.2%
535                   22     0.4%
 

area
Real number (ℝ≥0)

Distinct count: 1395
Unique (%): 25.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1198.4056276265303
Minimum: 7.0
Maximum: 143993.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 7
5-th percentile: 29
Q1: 114
median: 322
Q3: 980
95-th percentile: 4590
Maximum: 143993
Range: 143986
Interquartile range (IQR): 866

Descriptive statistics

Standard deviation: 4849.37695
Coefficient of variation (CV): 4.046523847
Kurtosis: 484.7553928
Mean: 1198.405628
Median Absolute Deviation (MAD): 252
Skewness: 19.52376995
Sum: 6558874
Variance: 23516456.8

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
96                    65     1.2%
77                    51     0.9%
112                   49     0.9%
42                    44     0.8%
56                    42     0.8%
72                    42     0.8%
120                   42     0.8%
180                   40     0.7%
98                    39     0.7%
91                    36     0.7%
40                    35     0.6%
35                    34     0.6%
168                   34     0.6%
126                   33     0.6%
200                   31     0.6%
128                   31     0.6%
70                    30     0.5%
144                   30     0.5%
63                    30     0.5%
576                   29     0.5%
84                    29     0.5%
140                   29     0.5%
80                    29     0.5%
78                    28     0.5%
48                    28     0.5%
Other values (1370)   4563   83.4%

Smallest values

Value                 Count  Frequency (%)
7                     11     0.2%
8                     18     0.3%
9                     24     0.4%
10                    17     0.3%
11                    6      0.1%
12                    21     0.4%
13                    10     0.2%
14                    19     0.3%
15                    14     0.3%
16                    8      0.1%

Largest values

Value                 Count  Frequency (%)
143993                1      < 0.1%
142290                1      < 0.1%
140752                1      < 0.1%
98368                 1      < 0.1%
87234                 1      < 0.1%
81954                 1      < 0.1%
78352                 1      < 0.1%
72204                 1      < 0.1%
67626                 1      < 0.1%
45760                 1      < 0.1%
 

eccen
Real number (ℝ≥0)

Distinct count: 1511
Unique (%): 27.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.753976977891467
Minimum: 0.006999999999999999
Maximum: 537.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 0.007
5-th percentile: 0.769
Q1: 2.143
median: 5.167
Q3: 13.625
95-th percentile: 46
Maximum: 537
Range: 536.993
Interquartile range (IQR): 11.482

Descriptive statistics

Standard deviation: 30.70373722
Coefficient of variation (CV): 2.232353396
Kurtosis: 59.00986276
Mean: 13.75397698
Median Absolute Deviation (MAD): 3.678
Skewness: 6.71721425
Sum: 75275.516
Variance: 942.719479

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
2                     120    2.2%
1                     99     1.8%
1.5                   68     1.2%
3                     65     1.2%
4                     53     1.0%
1.571                 46     0.8%
5                     43     0.8%
1.333                 36     0.7%
1.857                 35     0.6%
6                     35     0.6%
1.75                  30     0.5%
2.333                 30     0.5%
2.5                   29     0.5%
2.167                 29     0.5%
9                     29     0.5%
8                     28     0.5%
3.5                   27     0.5%
10                    27     0.5%
1.143                 26     0.5%
1.125                 25     0.5%
7                     25     0.5%
1.286                 24     0.4%
16                    24     0.4%
0.667                 24     0.4%
1.4                   24     0.4%
Other values (1486)   4472   81.7%

Smallest values

Value                 Count  Frequency (%)
0.007                 1      < 0.1%
0.009                 1      < 0.1%
0.012                 2      < 0.1%
0.013                 1      < 0.1%
0.014                 1      < 0.1%
0.019                 4      0.1%
0.021                 2      < 0.1%
0.024                 1      < 0.1%
0.026                 2      < 0.1%
0.027                 2      < 0.1%

Largest values

Value                 Count  Frequency (%)
537                   1      < 0.1%
413                   1      < 0.1%
379                   1      < 0.1%
288                   1      < 0.1%
283                   1      < 0.1%
279                   1      < 0.1%
278                   1      < 0.1%
277                   2      < 0.1%
269                   3      0.1%
268.5                 10     0.2%
 

p_black
Real number (ℝ≥0)

Distinct count: 711
Unique (%): 13.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3686424264571533
Minimum: 0.052000000000000005
Maximum: 1.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 0.052
5-th percentile: 0.156
Q1: 0.261
median: 0.337
Q3: 0.426
95-th percentile: 0.7762
Maximum: 1
Range: 0.948
Interquartile range (IQR): 0.165

Descriptive statistics

Standard deviation: 0.1777567501
Coefficient of variation (CV): 0.4821928714
Kurtosis: 3.35189289
Mean: 0.3686424265
Median Absolute Deviation (MAD): 0.08
Skewness: 1.63008287
Sum: 2017.58
Variance: 0.03159746221

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
1                     106    1.9%
0.286                 58     1.1%
0.375                 53     1.0%
0.333                 52     1.0%
0.4                   38     0.7%
0.357                 38     0.7%
0.5                   37     0.7%
0.25                  34     0.6%
0.3                   33     0.6%
0.292                 28     0.5%
0.371                 27     0.5%
0.306                 27     0.5%
0.329                 27     0.5%
0.417                 27     0.5%
0.35                  26     0.5%
0.429                 26     0.5%
0.356                 26     0.5%
0.36                  25     0.5%
0.294                 25     0.5%
0.321                 25     0.5%
0.268                 25     0.5%
0.339                 24     0.4%
0.364                 24     0.4%
0.298                 24     0.4%
0.361                 24     0.4%
Other values (686)    4614   84.3%

Smallest values

Value                 Count  Frequency (%)
0.052                 3      0.1%
0.055                 3      0.1%
0.056                 1      < 0.1%
0.057                 1      < 0.1%
0.059                 1      < 0.1%
0.06                  1      < 0.1%
0.063                 3      0.1%
0.065                 2      < 0.1%
0.067                 1      < 0.1%
0.07                  1      < 0.1%

Largest values

Value                 Count  Frequency (%)
1                     106    1.9%
0.998                 1      < 0.1%
0.994                 1      < 0.1%
0.993                 3      0.1%
0.992                 1      < 0.1%
0.99                  2      < 0.1%
0.986                 1      < 0.1%
0.985                 1      < 0.1%
0.984                 1      < 0.1%
0.983                 1      < 0.1%
 

p_and
Real number (ℝ≥0)

Distinct count: 700
Unique (%): 12.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7850526219623607
Minimum: 0.062
Maximum: 1.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 0.062
5-th percentile: 0.484
Q1: 0.679
median: 0.803
Q3: 0.927
95-th percentile: 1
Maximum: 1
Range: 0.938
Interquartile range (IQR): 0.248

Descriptive statistics

Standard deviation: 0.1706612788
Coefficient of variation (CV): 0.2173883305
Kurtosis: 0.6919281143
Mean: 0.785052622
Median Absolute Deviation (MAD): 0.124
Skewness: -0.8347174978
Sum: 4296.593
Variance: 0.02912527208

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
1                     639    11.7%
0.778                 34     0.6%
0.75                  29     0.5%
0.786                 29     0.5%
0.857                 28     0.5%
0.667                 25     0.5%
0.889                 24     0.4%
0.8                   24     0.4%
0.875                 23     0.4%
0.792                 23     0.4%
0.886                 23     0.4%
0.958                 23     0.4%
0.768                 22     0.4%
0.933                 21     0.4%
0.927                 21     0.4%
0.938                 21     0.4%
0.929                 20     0.4%
0.85                  20     0.4%
0.818                 20     0.4%
0.813                 20     0.4%
0.939                 19     0.3%
0.833                 19     0.3%
0.872                 18     0.3%
0.688                 18     0.3%
0.844                 18     0.3%
Other values (675)    4292   78.4%

Smallest values

Value                 Count  Frequency (%)
0.062                 1      < 0.1%
0.066                 1      < 0.1%
0.07                  1      < 0.1%
0.071                 1      < 0.1%
0.089                 1      < 0.1%
0.09                  1      < 0.1%
0.094                 1      < 0.1%
0.095                 2      < 0.1%
0.105                 1      < 0.1%
0.109                 1      < 0.1%

Largest values

Value                 Count  Frequency (%)
1                     639    11.7%
0.999                 3      0.1%
0.998                 4      0.1%
0.997                 7      0.1%
0.996                 8      0.1%
0.995                 3      0.1%
0.994                 5      0.1%
0.993                 7      0.1%
0.992                 7      0.1%
0.991                 16     0.3%
 

mean_tr
Real number (ℝ≥0)

SKEWED

Distinct count: 851
Unique (%): 15.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.219278275169011
Minimum: 1.0
Maximum: 4955.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.27
Q1: 1.61
median: 2.07
Q3: 3
95-th percentile: 16
Maximum: 4955
Range: 4954
Interquartile range (IQR): 1.39

Descriptive statistics

Standard deviation: 69.07902063
Coefficient of variation (CV): 11.10724067
Kurtosis: 4816.887815
Mean: 6.219278275
Median Absolute Deviation (MAD): 0.59
Skewness: 67.42540616
Sum: 34038.11
Variance: 4771.911091

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
2                     93     1.7%
1.38                  59     1.1%
1.4                   51     0.9%
1.36                  47     0.9%
1.33                  46     0.8%
1.93                  43     0.8%
1.5                   43     0.8%
7                     43     0.8%
1.71                  43     0.8%
3                     42     0.8%
1.83                  42     0.8%
2.2                   41     0.7%
1.75                  40     0.7%
1.35                  40     0.7%
1.29                  39     0.7%
1.67                  38     0.7%
8                     37     0.7%
1.57                  36     0.7%
1.41                  36     0.7%
1.58                  35     0.6%
2.33                  35     0.6%
1.46                  35     0.6%
1.63                  34     0.6%
1.64                  34     0.6%
1.69                  34     0.6%
Other values (826)    4407   80.5%

Smallest values

Value                 Count  Frequency (%)
1                     3      0.1%
1.02                  4      0.1%
1.03                  5      0.1%
1.04                  3      0.1%
1.05                  6      0.1%
1.06                  4      0.1%
1.07                  11     0.2%
1.08                  5      0.1%
1.09                  4      0.1%
1.11                  6      0.1%

Largest values

Value                 Count  Frequency (%)
4955                  1      < 0.1%
537                   1      < 0.1%
412                   1      < 0.1%
277                   1      < 0.1%
257                   1      < 0.1%
245.83                1      < 0.1%
214                   1      < 0.1%
207                   1      < 0.1%
204                   1      < 0.1%
197                   1      < 0.1%
 

blackpix
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1069
Unique (%): 19.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 365.9307509592545
Minimum: 7.0
Maximum: 33017.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 7
5-th percentile: 12
Q1: 42
median: 108
Q3: 284
95-th percentile: 1233.2
Maximum: 33017
Range: 33010
Interquartile range (IQR): 242

Descriptive statistics

Standard deviation: 1270.333082
Coefficient of variation (CV): 3.471512243
Kurtosis: 251.7609965
Mean: 365.930751
Median Absolute Deviation (MAD): 81
Skewness: 13.52910554
Sum: 2002739
Variance: 1613746.139

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
7                     64     1.2%
8                     62     1.1%
9                     52     1.0%
15                    50     0.9%
13                    48     0.9%
27                    47     0.9%
14                    45     0.8%
11                    45     0.8%
28                    44     0.8%
31                    44     0.8%
24                    43     0.8%
42                    42     0.8%
20                    42     0.8%
22                    42     0.8%
47                    42     0.8%
23                    41     0.7%
46                    39     0.7%
16                    39     0.7%
53                    38     0.7%
26                    38     0.7%
38                    37     0.7%
19                    37     0.7%
12                    37     0.7%
33                    36     0.7%
60                    36     0.7%
Other values (1044)   4383   80.1%

Smallest values

Value                 Count  Frequency (%)
7                     64     1.2%
8                     62     1.1%
9                     52     1.0%
10                    32     0.6%
11                    45     0.8%
12                    37     0.7%
13                    48     0.9%
14                    45     0.8%
15                    50     0.9%
16                    39     0.7%

Largest values

Value                 Count  Frequency (%)
33017                 1      < 0.1%
28093                 1      < 0.1%
27820                 1      < 0.1%
26693                 1      < 0.1%
23025                 1      < 0.1%
19430                 1      < 0.1%
18591                 1      < 0.1%
17721                 1      < 0.1%
14292                 1      < 0.1%
14180                 1      < 0.1%
 

blackand
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1718
Unique (%): 31.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 741.1081673670748
Minimum: 7.0
Maximum: 46133.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 7
5-th percentile: 24
Q1: 95
median: 250
Q3: 718
95-th percentile: 2867.8
Maximum: 46133
Range: 46126
Interquartile range (IQR): 623

Descriptive statistics

Standard deviation: 1881.504302
Coefficient of variation (CV): 2.538771511
Kurtosis: 187.0185333
Mean: 741.1081674
Median Absolute Deviation (MAD): 194
Skewness: 11.14507385
Sum: 4056085
Variance: 3540058.438

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
89                    28     0.5%
77                    27     0.5%
8                     25     0.5%
35                    25     0.5%
56                    25     0.5%
72                    25     0.5%
42                    25     0.5%
88                    23     0.4%
54                    22     0.4%
76                    22     0.4%
9                     22     0.4%
7                     22     0.4%
84                    22     0.4%
70                    22     0.4%
71                    21     0.4%
40                    21     0.4%
14                    20     0.4%
36                    20     0.4%
108                   20     0.4%
96                    19     0.3%
49                    19     0.3%
75                    19     0.3%
43                    19     0.3%
44                    19     0.3%
110                   19     0.3%
Other values (1693)   4922   89.9%

Smallest values

Value                 Count  Frequency (%)
7                     22     0.4%
8                     25     0.5%
9                     22     0.4%
10                    18     0.3%
11                    15     0.3%
12                    19     0.3%
13                    18     0.3%
14                    20     0.4%
15                    16     0.3%
16                    17     0.3%

Largest values

Value                 Count  Frequency (%)
46133                 1      < 0.1%
42821                 1      < 0.1%
35499                 1      < 0.1%
34874                 1      < 0.1%
25400                 1      < 0.1%
25163                 1      < 0.1%
23547                 1      < 0.1%
23457                 1      < 0.1%
23301                 1      < 0.1%
23092                 1      < 0.1%
 

wb_trans
Real number (ℝ≥0)

Distinct count: 581
Unique (%): 10.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 106.6628905536269
Minimum: 1.0
Maximum: 3212.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 17
median: 49
Q3: 126
95-th percentile: 408.4
Maximum: 3212
Range: 3211
Interquartile range (IQR): 109

Descriptive statistics

Standard deviation: 167.3083617
Coefficient of variation (CV): 1.56857142
Kurtosis: 60.95793459
Mean: 106.6628906
Median Absolute Deviation (MAD): 38
Skewness: 5.504535353
Sum: 583766
Variance: 27992.08791

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
1                     256    4.7%
14                    91     1.7%
6                     84     1.5%
2                     79     1.4%
11                    77     1.4%
3                     75     1.4%
8                     75     1.4%
9                     74     1.4%
12                    74     1.4%
15                    71     1.3%
4                     71     1.3%
21                    70     1.3%
13                    68     1.2%
10                    67     1.2%
19                    66     1.2%
18                    66     1.2%
20                    63     1.2%
16                    63     1.2%
7                     62     1.1%
17                    61     1.1%
27                    59     1.1%
26                    55     1.0%
5                     55     1.0%
31                    51     0.9%
23                    51     0.9%
Other values (556)    3589   65.6%

Smallest values

Value                 Count  Frequency (%)
1                     256    4.7%
2                     79     1.4%
3                     75     1.4%
4                     71     1.3%
5                     55     1.0%
6                     84     1.5%
7                     62     1.1%
8                     75     1.4%
9                     74     1.4%
10                    67     1.2%

Largest values

Value                 Count  Frequency (%)
3212                  1      < 0.1%
2925                  1      < 0.1%
2333                  1      < 0.1%
2273                  1      < 0.1%
1815                  1      < 0.1%
1756                  1      < 0.1%
1651                  1      < 0.1%
1644                  1      < 0.1%
1641                  1      < 0.1%
1634                  1      < 0.1%
 

target
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.202631098118034
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7214702987
Coefficient of variation (CV): 0.5999098974
Kurtosis: 17.07645268
Mean: 1.202631098
Median Absolute Deviation (MAD): 0
Skewness: 4.143712176
Sum: 6582
Variance: 0.5205193918

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count  Frequency (%)
1                     4913   89.8%
2                     329    6.0%
5                     115    2.1%
4                     88     1.6%
3                     28     0.5%

Smallest values

Value                 Count  Frequency (%)
1                     4913   89.8%
2                     329    6.0%
3                     28     0.5%
4                     88     1.6%
5                     115    2.1%

Largest values

Value                 Count  Frequency (%)
5                     115    2.1%
4                     88     1.6%
3                     28     0.5%
2                     329    6.0%
1                     4913   89.8%
 

Interactions

[Figures: 121 pairwise interaction plots, one per ordered pair of the 11 variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
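
The calculation just described can be sketched in a few lines of plain Python. This is an illustrative implementation of the textbook formula, not the code the report itself runs; the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (std_x * std_y)


xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
# Shifting and rescaling xs leaves r unchanged (location and scale invariance).
print(pearson_r(xs, ys), pearson_r([10 * x + 5 for x in xs], ys))
```

The 1/n factors cancel between numerator and denominator, so using the sample (n - 1) versions of covariance and standard deviation would give the same r.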

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
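
A minimal sketch of that procedure: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their ranks is an assumption (a common convention that the text above does not specify), and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but nonlinear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this cubic example ρ reaches 1 (up to floating-point rounding) while Pearson's r stays below 1, which is the behaviour the paragraph above describes.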

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
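
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name is hypothetical, and statistical libraries often report a tie-corrected variant (tau-b) instead, so values may differ from a library result when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    concordant = discordant = total = 0
    for i, j in combinations(range(len(xs)), 2):
        total += 1
        # A pair is concordant when both variables order it the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / total


# One swapped neighbour pair among six pairs: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This direct version compares every pair and therefore runs in quadratic time, which is fine for illustration but slow for large datasets.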

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. See the Phik project documentation for details.

Missing values

[Figures: two missing-values plots]

Sample

First rows

      height  lenght  area    eccen    p_black  p_and  mean_tr  blackpix  blackand  wb_trans  target
0     5.0     7.0     35.0    1.400    0.400    0.657  2.33     14.0      23.0      6.0       1
1     6.0     7.0     42.0    1.167    0.429    0.881  3.60     18.0      37.0      5.0       1
2     6.0     18.0    108.0   3.000    0.287    0.741  4.43     31.0      80.0      7.0       1
3     5.0     7.0     35.0    1.400    0.371    0.743  4.33     13.0      26.0      3.0       1
4     6.0     3.0     18.0    0.500    0.500    0.944  2.25     9.0       17.0      4.0       1
5     5.0     8.0     40.0    1.600    0.550    1.000  2.44     22.0      40.0      9.0       1
6     6.0     4.0     24.0    0.667    0.417    0.708  2.50     10.0      17.0      4.0       1
7     5.0     6.0     30.0    1.200    0.333    0.333  10.00    10.0      10.0      1.0       1
8     5.0     5.0     25.0    1.000    0.400    0.520  10.00    10.0      13.0      1.0       1
9     5.0     7.0     35.0    1.400    0.486    0.914  8.50     17.0      32.0      2.0       1

Last rows

      height  lenght  area    eccen    p_black  p_and  mean_tr  blackpix  blackand  wb_trans  target
5463  11.0    193.0   2123.0  17.545   0.243    0.683  1.59     516.0     1451.0    325.0     1
5464  1.0     20.0    20.0    20.000   0.850    1.000  8.50     17.0      20.0      2.0       2
5465  1.0     87.0    87.0    87.000   0.920    1.000  16.00    80.0      87.0      5.0       2
5466  1.0     279.0   279.0   279.000  0.964    1.000  38.43    269.0     279.0     7.0       2
5467  1.0     16.0    16.0    16.000   1.000    1.000  16.00    16.0      16.0      1.0       2
5468  4.0     524.0   2096.0  131.000  0.542    0.603  40.57    1136.0    1264.0    28.0      2
5469  7.0     4.0     28.0    0.571    0.714    0.929  10.00    20.0      26.0      2.0       1
5470  6.0     95.0    570.0   15.833   0.300    0.911  1.64     171.0     519.0     104.0     1
5471  7.0     41.0    287.0   5.857    0.213    0.801  1.36     61.0      230.0     45.0      1
5472  8.0     1.0     8.0     0.125    1.000    1.000  8.00     8.0       8.0       1.0       4

Duplicate rows

Most frequent

      height  lenght  area    eccen    p_black  p_and  mean_tr  blackpix  blackand  wb_trans  target  count
30    9.0     1.0     9.0     0.111    1.000    1.000  9.0      9.0       9.0       1.0       4       8
26    8.0     1.0     8.0     0.125    1.000    1.000  8.0      8.0       8.0       1.0       4       6
2     1.0     8.0     8.0     8.000    1.000    1.000  8.0      8.0       8.0       1.0       2       5
14    3.0     3.0     9.0     1.000    0.778    0.778  7.0      7.0       7.0       1.0       1       5
25    7.0     1.0     7.0     0.143    1.000    1.000  7.0      7.0       7.0       1.0       4       5
15    3.0     3.0     9.0     1.000    0.889    0.889  8.0      8.0       8.0       1.0       1       4
34    13.0    1.0     13.0    0.077    1.000    1.000  13.0     13.0      13.0      1.0       4       4
0     1.0     7.0     7.0     7.000    1.000    1.000  7.0      7.0       7.0       1.0       1       3
3     1.0     10.0    10.0    10.000   1.000    1.000  10.0     10.0      10.0      1.0       2       3
8     1.0     14.0    14.0    14.000   1.000    1.000  14.0     14.0      14.0      1.0       2       3
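
A table like "Most frequent" above can be produced by counting identical rows and keeping those that occur more than once. The sketch below shows that idea in plain Python on a small made-up sample; the function name and the sample rows are hypothetical, and this is not necessarily how the report computes its duplicate count.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, count) pairs for rows seen more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    duplicates = [(row, count) for row, count in counts.items() if count > 1]
    duplicates.sort(key=lambda item: item[1], reverse=True)
    return duplicates[:top]


# Hypothetical three-column sample (height, lenght, area), not taken from the dataset.
sample = [
    (9, 1, 9),
    (8, 1, 8),
    (9, 1, 9),
    (1, 8, 8),
    (9, 1, 9),
    (8, 1, 8),
]
for row, count in most_frequent_duplicates(sample):
    print(row, count)
```

Rows are converted to tuples so they can serve as dictionary keys; rows with equal counts keep the order in which they were first seen.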